Abstract


A primary motivation for reasoning under uncertainty is to derive decisions in
the face of inconclusive evidence. Shafer’s theory of belief functions, which explicitly
represents the underconstrained nature of many reasoning problems, lacks a formal
procedure for making decisions. Clearly, when sufficient information is not available,
no theory can prescribe actions without making additional assumptions. In
this situation, some assumption must be made if a clearly superior choice is to emerge.
In this paper we offer a probabilistic interpretation of a simple assumption that
disambiguates decision problems represented with belief functions. We prove that it yields
expected values identical to those obtained by a probabilistic analysis that makes the
same assumption. We maintain a strict separation between evidence that carries
information about a situation and assumptions that may be made for disambiguation
of choices. In addition, we show how the decision analysis methodology frequently
employed in probabilistic reasoning can be extended for use with belief functions. This
generalization of decision analysis allows the use of belief functions within the familiar
framework of decision trees.


KEYWORDS: belief functions, decision analysis, decision-making, decision tree,
Dempster-Shafer theory, evidential reasoning, reasoning under uncertainty
1  Introduction


Decision analysis provides a methodological approach for making decisions. Uncertain states
of nature are represented by probability distributions, and each possible state is assigned a
value or utility. The best decision is the one that yields the greatest expected utility. By
enumerating in a decision tree all available choices and assessing the probabilities and utilities
of the states of nature that may result, one can mechanically determine the optimal sequence
of actions to take [4, 8, 9, 14].

In practice, these simple requirements are hard to satisfy [3]. Sometimes, reliable
estimates of the probabilities involved are hard to come by. For example, few statistics are
available for determining the probability of a nuclear reactor core meltdown. Assessing the
utility of many-faceted states of nature is equally challenging. How should one give a unique
value to the anticipated quality of married life? These limitations have hindered the more
widespread application of decision analysis.

Shafer’s  theory  of  belief  functions  [12,  15,  16,  18]  allows  one  to  express  partial  beliefs 
when  it  is  impossible  or  impractical  to  assess  complete  probability  distributions  confidently. 
Using  belief  functions,  one  can  bound  the  probabilities  of  events  for  which  the  assignment 
of  a  precise  probability  would  be  misleading.  The  theory  provides  a  facility  to  express  one’s 
beliefs  only  to  the  degree  to  which  there  is  supporting  evidence,  thereby  resulting  in  an 
appropriate  description  of  an  uncertain  event.  For  example,  there  might  be  reason  to  assign 
a  probability  to  a  reactor  malfunction,  without  saying  what  the  chance  is  that  it  may  lead 
to  a  core  meltdown. 

Despite its representational advantages, the theory of belief functions lacks a formal
basis upon which decisions can be made in the face of ambiguity [1]. Computing the
expected utility of a random event that has been represented with belief functions results in
an expected utility interval (EUI). To choose between two actions one must compare their
respective EUIs. If they don’t overlap, the choice is clear. But when the EUIs overlap, the
decision-maker is confronted with a dilemma: the available evidence does not support
either choice. Ideally, one should collect more information until the intervals no longer overlap
and the choice becomes clear. However, sometimes one is forced to choose without benefit
of additional information. What should be done?

In this situation there is no recourse except to make an assumption to eliminate the
ambiguity. Various authors have expressed preference for different assumptions (such as
renormalization, generalized insufficient reason [2, 20], minimax [22] and optimism/pessimism [6]).
More elaborate schemes have been suggested, but they also amount to the introduction of
unfounded assumptions [11, 13, 23]. Here we advocate the interpolation of a point-valued
utility within the EUI. We make no claim that it leads to superior decisions, but do claim
that it is no less viable than the alternative assumptions. We show that it gives the same
expected utility (and hence leads to the same decisions) as would be obtained by assuming
that there is some probability that ambiguity will be resolved in one’s favor.

We further show how decision analysis can be generalized to accommodate a belief function
representation of uncertainty. This involves two modifications: allowing an interval as the
utility of a state or set of states, and allowing a belief function in place of a probability
distribution. The result is a complete decision analysis procedure compatible with either
probabilistic or belief function representations of uncertainty.

We should point out that decision theory (and its associated utility theory) is not the
only approach for making decisions under uncertainty. For example, Lesh has proposed a
model based on an ignorance-preference coefficient that is empirically derived [10]. Shafer
has advocated a “constructive” decision theory which seeks support for actions that achieve
goals [17]. Loui et al. suggest representing beliefs not by one distribution, but by a
sequence of progressively more decisive distributions [11]. In this paper we are concerned with
providing for the use of belief functions within the general framework of decision analysis.

It is worth noting that none of the material described in this paper depends on the use of
Dempster’s rule, which is commonly used in Shafer’s theory to combine independent bodies
of evidence [16]. The computation of the expected utility interval, and the procedure for using
EUIs in decision analysis, require only that a belief function representation of the problem
be available. Dempster’s rule could be used to construct that belief function, but it is not
required for decision analysis.

In the sections that follow we develop the theory and illustrate its use with simple
examples. In Section 2 we derive the expected utility interval that results from the use of
belief functions. We then show how making an assumption about the probability of nature’s
cooperation leads to the same expected utility as interpolation within the EUI. In Section 3,
this result is used to generalize decision analysis and is illustrated within a decision problem
concerning whether or not to drill for oil. We conclude with a discussion of the benefits and
limitations of our approach, and compare its use with other approaches to decision-making
under uncertainty.


2  Expected  Value 

Decision  analysis  provides  a  methodological  approach  for  making  decisions.  The  crux  of  the 
method  is  that  one  should  choose  the  action  that  will  maximize  the  expected  utility.  In  this 
section we review the computation of expected utility using a probabilistic representation of
a  simple  example  and  show  how  a  belief  function  gives  rise  to  a  range  of  expected  utilities. 
We  then  show  how  a  simple  assumption  about  the  inclination  of  nature  leads  to  a  means  for 
choosing  a  single-point  expected  utility  for  belief  functions. 

2.1  Expected  value  using  probabilities 

Example - Carnival Wheel #1  A familiar game of chance is the carnival wheel
pictured in Figure 1. This wheel is divided into 10 equal sectors, each of which is
labeled with a dollar amount as shown. For a $6.00 fee, the player gets to spin the
wheel and receives the amount shown in the sector that stops at the top. Should we
be willing to play?
Figure 1: Carnival Wheel #1

The  analysis  of  this  problem  lends  itself  readily  to  a  probabilistic  representation.  From 
inspection  of  the  wheel  (assuming  each  sector  really  is  equally  likely),  we  can  construct  the 
following  probability  distribution: 


    p($1)  = 0.4
    p($5)  = 0.3
    p($10) = 0.2
    p($20) = 0.1

The expected value \(E(x)\) is computed from the formula

\[ E(x) = \sum_{x \in \Theta} x \cdot p(x) \tag{1} \]

where \(\Theta\) is the set of possible outcomes. The expected value of the carnival wheel is $5.90,
as shown here:


     x     p(x)    x · p(x)
     1     0.4     0.4
     5     0.3     1.5
    10     0.2     2.0
    20     0.1     2.0
                   E(x) = 5.90
Figure 2: Carnival Wheel #2

Therefore, we should refuse to play, because the expected value of playing the game is less
than the $6.00 cost of playing.¹ Let us now modify the problem slightly in order to motivate
a belief function approach to the problem.
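
The Wheel #1 calculation can be reproduced in a few lines. The following Python sketch is illustrative only: the names `wheel_1`, `expected_value`, and `fee` are our own, and dollars are treated as utility, as assumed in the text. It evaluates Equation 1 over the distribution given above and compares the result with the $6.00 fee.

```python
# Probability distribution for Carnival Wheel #1: dollar payoff -> probability.
wheel_1 = {1: 0.4, 5: 0.3, 10: 0.2, 20: 0.1}


def expected_value(dist):
    """Equation 1: the sum of x * p(x) over all outcomes x."""
    total_probability = sum(dist.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in dist.items())


fee = 6.00  # cost of one spin, in dollars
ev_1 = expected_value(wheel_1)
should_play = ev_1 > fee  # play only if the expected payoff exceeds the fee

print(f"E(x) = {ev_1:.2f}, play: {should_play}")
```

The check on the total probability is a small safeguard we added; it is not part of Equation 1 itself.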

2.2  Expected  value  intervals 

Example - Carnival Wheel #2  Another carnival wheel is divided into 10 equal
sectors, each having $1, $5, $10, or $20 printed on it. However, one of the sectors is
hidden from view. How much are we willing to pay to play this game?

This problem is ideally suited to an analysis using belief functions. In a belief function
representation, a unit of belief is distributed over the space of possible outcomes (commonly
called the frame of discernment). Unlike a probability distribution, which distributes belief
over elements of the outcome space, this distribution (called a mass function) attributes
belief to subsets of the outcome space. Belief attributed to a subset signifies that there
is reason to believe that the outcome will be among the elements of that subset, without
committing to any preference among those elements. Formally, a mass distribution \(m_\Theta\) is a
mapping from subsets of a frame of discernment \(\Theta\) into the unit interval:

¹ We assume that the monetary value is directly proportional to utility because of the small dollar amounts
involved. We could instead have chosen to work with utilities to account for nonlinearities in one’s preferences
for money.
\[ m_\Theta : 2^\Theta \to [0, 1], \]

such that

\[ m_\Theta(\emptyset) = 0 \quad \text{and} \quad \sum_{A_i \subseteq \Theta} m_\Theta(A_i) = 1 . \]

Any subset to which nonzero mass has been attributed is called a focal element. One of the
ramifications of this representation is that the belief in a hypothesis \(A\) (\(A \subseteq \Theta\)) is constrained
to lie within an interval \([\mathrm{Spt}(A), \mathrm{Pls}(A)]\), where

\[ \mathrm{Spt}(A) = \sum_{A_i \subseteq A} m_\Theta(A_i) ; \qquad \mathrm{Pls}(A) = 1 - \mathrm{Spt}(\neg A) . \tag{2} \]

These bounds are commonly referred to as support and plausibility.

The frame of discernment \(\Theta\) for Wheel #2 is {$1, $5, $10, $20}. The mass function for
Wheel #2 is shown below,

    m({$1})               = 0.4
    m({$5})               = 0.2
    m({$10})              = 0.2
    m({$20})              = 0.1
    m({$1, $5, $10, $20}) = 0.1,

and its associated belief intervals are

    [Spt({$1}),  Pls({$1})]  = [0.4, 0.5]
    [Spt({$5}),  Pls({$5})]  = [0.2, 0.3]
    [Spt({$10}), Pls({$10})] = [0.2, 0.3]
    [Spt({$20}), Pls({$20})] = [0.1, 0.2]
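
As a check on these intervals, support and plausibility can be computed directly from the mass function. The sketch below is one possible encoding, not the paper's own: focal elements are represented as Python `frozenset` objects, and the names `FRAME`, `mass_2`, `support`, and `plausibility` are assumptions made for illustration.

```python
# Frame of discernment for Wheel #2 (dollar values printed on the sectors).
FRAME = frozenset({1, 5, 10, 20})

# Mass function for Wheel #2: focal element -> mass.
# The hidden sector contributes its 0.1 to the whole frame.
mass_2 = {
    frozenset({1}): 0.4,
    frozenset({5}): 0.2,
    frozenset({10}): 0.2,
    frozenset({20}): 0.1,
    FRAME: 0.1,
}


def support(mass, hypothesis):
    """Spt(A): total mass of the focal elements contained in A (Equation 2)."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass, hypothesis, frame):
    """Pls(A) = 1 - Spt(not A), with the complement taken within the frame."""
    return 1.0 - support(mass, frame - hypothesis)


belief_intervals = {
    value: (
        support(mass_2, frozenset({value})),
        plausibility(mass_2, frozenset({value}), FRAME),
    )
    for value in sorted(FRAME)
}

for value, (spt, pls) in belief_intervals.items():
    print(f"${value}: [{spt:.1f}, {pls:.1f}]")
```

The subset test `focal <= hypothesis` mirrors the condition under the summation sign in Equation 2.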

Before we can compute the expected value of the wheel represented by this belief function,
we must somehow assess the value of the hidden sector. We know that there is a 0.1 chance
that the hidden sector will be selected, but what value should we attribute to that sector?
If the carnival hawker were allowed to assign a dollar value to that sector, he would surely
have assigned $1. On the other hand, if we (or a cooperative friend) were allowed to do so, it
would have been $20. Any other assignment method would result in a value between $1 and
$20, inclusive. Therefore, if we truly do not know what assignment method was used, the
strongest statement that we can make is that the value of the hidden sector is between $1
and $20. Using interval arithmetic we can apply the expected value formula of Equation 1
to obtain an expected value interval (EVI):
\[ E(x) = [E_*(x), E^*(x)] \tag{3} \]

where²

\[ E_*(x) = \sum_{A_i \subseteq \Theta} \inf(A_i) \cdot m_\Theta(A_i) \]

\[ E^*(x) = \sum_{A_i \subseteq \Theta} \sup(A_i) \cdot m_\Theta(A_i) . \]

The expected value interval of Wheel #2 is

\[ E(x) = [\, 0.4(1) + 0.2(5) + 0.2(10) + 0.1(20) + 0.1(1), \;\; 0.4(1) + 0.2(5) + 0.2(10) + 0.1(20) + 0.1(20) \,] \]

\[ E(x) = [5.50, 7.40] . \]
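
The interval arithmetic of Equation 3 amounts to weighting the smallest and the largest element of each focal element by its mass. A minimal sketch, assuming the same `frozenset` encoding of focal elements as an illustration (the names `mass_2` and `expected_value_interval` are ours):

```python
# Mass function for Wheel #2: focal element -> mass.
mass_2 = {
    frozenset({1}): 0.4,
    frozenset({5}): 0.2,
    frozenset({10}): 0.2,
    frozenset({20}): 0.1,
    frozenset({1, 5, 10, 20}): 0.1,  # the hidden sector
}


def expected_value_interval(mass):
    """Equation 3: (lower, upper) expected value of a mass function
    whose focal elements are sets of scalar outcomes."""
    lower = sum(min(focal) * m for focal, m in mass.items())
    upper = sum(max(focal) * m for focal, m in mass.items())
    return lower, upper


lower_2, upper_2 = expected_value_interval(mass_2)
print(f"EVI = [{lower_2:.2f}, {upper_2:.2f}]")
```

For a mass function whose focal elements are all singletons, the two bounds coincide and the function reduces to the ordinary expected value of Equation 1.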

2.3  Expected  value  using  belief  functions 

As many researchers have pointed out, an interval of expected values is not very satisfactory
when we have to make a decision. Sometimes it provides all the information necessary to
make a decision, e.g. if the game costs $5 to play, then clearly we should be willing to play
regardless of who gets to assign a value to the hidden sector. Sometimes we can defer making
the decision until we have collected more evidence, e.g. if we could peek at the hidden sector
and then decide whether or not to play. But the need to make a decision based on the
currently available information is often inescapable, e.g. should we spin Wheel #2 for a
$6 fee? We will present our methodology for decision-making using belief functions after
pausing to consider a Bayesian analysis of the same situation.

If we are to use the probabilistic definition of expected value from Equation 1, we are
forced to assess probabilities of all possible outcomes. To do this, we must make additional
assumptions before proceeding further. One possible assumption is that all four values
of the hidden sector ($1, $5, $10, $20) are equally likely, and we could evenly distribute
among those four values the 0.1 chance that the hidden sector is chosen. This is an example
of the generalized insufficient reason principle advanced by Dubois and Prade [2] and by
Smets [20]. The resulting computation of expected value with this assumption is shown
below; the expected value is $6.30:


     x     p(x)     x · p(x)
     1     0.425    0.425
     5     0.225    1.125
    10     0.225    2.250
    20     0.125    2.500
                    E(x) = 6.30

² We use \(\inf(A_i)\) or \(\sup(A_i)\) to denote the smallest or largest element in the set \(A_i \subseteq \Theta\). \(\Theta\) is assumed to
be a set of scalar values [21].
An alternative assumption is that the best estimate of the probability distribution for the
value of the hidden sector is the same as the known distribution of the visible sectors. Using
this assumption, the result is $6.00:


     x     p(x)    x · p(x)
     1     4/9      4/9
     5     2/9     10/9
    10     2/9     20/9
    20     1/9     20/9
                   E(x) = 6.00
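
Both Bayesian completions of Wheel #2 can be checked numerically. The sketch below is a hedged illustration: the function names are our own, and `spread_like_visible` assumes that the only non-singleton focal element is the whole frame, which holds for Wheel #2 but not for belief functions in general.

```python
FRAME = frozenset({1, 5, 10, 20})
mass_2 = {
    frozenset({1}): 0.4,
    frozenset({5}): 0.2,
    frozenset({10}): 0.2,
    frozenset({20}): 0.1,
    FRAME: 0.1,  # the hidden sector
}


def spread_evenly(mass):
    """Generalized insufficient reason: divide the mass of each focal
    element equally among its elements."""
    dist = {}
    for focal, m in mass.items():
        for x in focal:
            dist[x] = dist.get(x, 0.0) + m / len(focal)
    return dist


def spread_like_visible(mass, frame):
    """Give the hidden sector the same distribution as the visible sectors.
    Assumes every focal element is a singleton or the whole frame."""
    visible = {next(iter(f)): m for f, m in mass.items() if len(f) == 1}
    hidden_mass = mass.get(frame, 0.0)
    visible_total = sum(visible.values())
    return {x: m + hidden_mass * m / visible_total for x, m in visible.items()}


def expected_value(dist):
    """Equation 1."""
    return sum(x * p for x, p in dist.items())


ev_even = expected_value(spread_evenly(mass_2))
ev_visible = expected_value(spread_like_visible(mass_2, FRAME))
print(f"insufficient reason: {ev_even:.2f}, visible distribution: {ev_visible:.2f}")
```

Both results lie inside the expected value interval [5.50, 7.40], as any probability distribution consistent with the mass function must.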

Rather than making one of these assumptions, we may wish to parameterize by an unknown
probability p our belief that either we get to choose the value of the hidden sector or the
carnival hawker does. Let p be the probability that the value assigned to the hidden sector
is the one that we would have assigned, if given the opportunity, so (1 - p) is the probability
that the carnival hawker chose the value of the hidden sector. That is,

    p(hidden sector is labeled $20) = p
    p(hidden sector is labeled $1)  = 1 - p .

The expected value of Wheel #2 can then be recomputed using probabilities and Equation 1,
as illustrated here:

     x     p(x)                x · p(x)
     1     0.4 + 0.1(1 - p)    0.5 - 0.1p
     5     0.2                 1.0
    10     0.2                 2.0
    20     0.1 + 0.1p          2.0 + 2p
                               E(x) = 5.50 + 1.90p

To  decide  whether  to  play  the  game,  we  need  only  assess  the  probability  p.  For  the 
carnival  wheel  it  would  be  wise  to  allow  that  the  hawker  has  hidden  the  value  from  our  view; 
thus we might assume that p = 0. So E(x) = 5.50, and we should not be willing to pay
more than $5.50 to spin the wheel.
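
To see how the decision depends on p, the parameterized distribution can be evaluated for a few values. This is a small illustrative sketch; the function names and the sample values of p are our own choices, not part of the original analysis.

```python
def wheel_2_distribution(p):
    """Distribution for Wheel #2 when the hidden sector is labeled $20
    with probability p and $1 with probability 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return {1: 0.4 + 0.1 * (1.0 - p), 5: 0.2, 10: 0.2, 20: 0.1 + 0.1 * p}


def expected_value(dist):
    """Equation 1."""
    return sum(x * prob for x, prob in dist.items())


def willing_to_play(p, fee):
    """Play only if the expected payoff exceeds the fee."""
    return expected_value(wheel_2_distribution(p)) > fee


for p in (0.0, 0.5, 1.0):
    ev = expected_value(wheel_2_distribution(p))
    print(f"p = {p}: E(x) = {ev:.2f}, play for $6: {willing_to_play(p, fee=6.00)}")
```

With the pessimistic choice p = 0 the $6 game is declined, which matches the conclusion drawn in the text.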

Example - Carnival Wheel #3  A third carnival wheel is divided into 10 equal
sectors, each having $1, $5, $10, or $20 printed on it. This wheel has 5 sectors hidden
from view. However, we do know that none of these sectors is a $20, that the first
hidden sector is either a $5 or a $10, and that the second hidden sector is either a $1
or a $10. How much are we willing to pay to spin Wheel #3?

Figure 3: Carnival Wheel #3

A probabilistic analysis of Wheel #3 requires one to make additional assumptions.
Estimating the conditional probability distribution for each hidden sector would provide enough
information to compute the expected value of the wheel. Alternatively, estimating just the
expected value of each hidden sector would suffice as well. However, doing so can be both
tedious and frustrating: tedious because there may be many hidden sectors, and frustrating
because we’re being asked to provide information that, in actuality, we do not have. (If we
knew the conditional probabilities or the expected values, we would have used them in our
original analysis.) What is the minimum information necessary to establish a single expected
value for Wheel #3?

The  probability,  p,  that  we  used  to  analyze  Wheel  #2  can  be  used  here  as  well. 
Definition  1 

Let  p  =  the  probability  that  ambiguity  will  be  resolved  as  favorably  as  possible; 

(1  —  p)  =  the  probability  that  ambiguity  will  be  resolved  as  unfavorably  as  possible. 

Estimating  p  is  sufficient  to  restrict  the  expected  value  of  a  belief  function  to  a  single  point. 
It  is  easy  to  see  that  the  expected  value  derived  from  this  analysis  as  p  varies  from  0  to  1  is 
exactly  the  value  obtained  by  linear  interpolation  of  the  EVI  that  results  from  using  belief 
functions.  The  following  derivation  shows  that  this  is  true  in  general. 

Theorem 1  Given a mass function \(m_\Theta\) defined over a scalar frame \(\Theta\) of utilities, and an
estimate of p (the probability that all residual ambiguity will turn out favorably), the expected
utility given \(m_\Theta\) is

\[ E(x) = E_*(x) + p \cdot (E^*(x) - E_*(x)) . \tag{4} \]

Proof: 

Consider a mass function \(m_\Theta\) defined over a frame of discernment \(\Theta\). Now consider any focal
element \(A \subseteq \Theta\), such that \(m_\Theta(A) > 0\). Since p is the probability that a cooperative agent
will control which \(x \in A\) will be selected, and (1 - p) is the probability that an adversary will
be in control, then the probability that x will be chosen given that focal element A occurs is

\[ p(x \mid A) = \begin{cases} p & \text{if } x = \sup(A) \\ (1 - p) & \text{if } x = \inf(A) \\ 0 & \text{otherwise.} \end{cases} \]

(When A is a singleton, \(\sup(A) = \inf(A)\) and the two contributions add to 1.)

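Considering all focal elements in \(m_\Theta\), we can construct a probability distribution \(p_\Theta(x)\) as
follows:

\[ p_\Theta(x) = \sum_{A_i \subseteq \Theta} p(x \mid A_i) \cdot m_\Theta(A_i) \]

\[ p_\Theta(x) = \sum_{A:\, \sup(A) = x} p \cdot m_\Theta(A) + \sum_{A:\, \inf(A) = x} (1 - p) \cdot m_\Theta(A) . \]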

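Using Equation 1 we have

\[ \begin{aligned}
E(x) &= \sum_{x \in \Theta} x \cdot p_\Theta(x) \\
     &= \sum_{x \in \Theta} x \cdot \Bigl( \sum_{A:\, \sup(A) = x} p \cdot m_\Theta(A) + \sum_{A:\, \inf(A) = x} (1 - p) \cdot m_\Theta(A) \Bigr) \\
     &= \sum_{x \in \Theta} \Bigl( \sum_{A:\, \sup(A) = x} \sup(A) \cdot p \cdot m_\Theta(A) + \sum_{A:\, \inf(A) = x} \inf(A) \cdot (1 - p) \cdot m_\Theta(A) \Bigr) .
\end{aligned} \]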

The double summations can be collapsed to a single summation because every \(A \subseteq \Theta\)
has a unique \(\sup(A) \in \Theta\) and a unique \(\inf(A) \in \Theta\).


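\[ \begin{aligned}
E(x) &= \sum_{A \subseteq \Theta} \bigl[ \sup(A) \cdot p \cdot m_\Theta(A) + \inf(A) \cdot (1 - p) \cdot m_\Theta(A) \bigr] \\
     &= \sum_{A \subseteq \Theta} \inf(A) \cdot m_\Theta(A) + p \cdot \sum_{A \subseteq \Theta} [\sup(A) - \inf(A)] \cdot m_\Theta(A) \\
     &= E_*(x) + p \cdot (E^*(x) - E_*(x)) .
\end{aligned} \]

□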

The important point of the proof is that the probabilistic analysis provides a meaningful
way to choose a distinguished point within an EVI that results from the use of belief
functions. That distinguished point can then be used as the basis for comparison of several
choices when their respective EVIs overlap.
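
Theorem 1 can also be checked numerically. The sketch below builds the distribution used in the proof (mass sent to sup(A) with probability p and to inf(A) with probability 1 - p) and compares its expected value with the linear interpolation of Equation 4. The mass function `example_mass` is hypothetical and chosen only for illustration; it does not describe any of the wheels in the text.

```python
def induced_distribution(mass, p):
    """Distribution from the proof: within each focal element A, the mass
    goes to sup(A) with probability p and to inf(A) with probability 1 - p."""
    dist = {}
    for focal, m in mass.items():
        hi, lo = max(focal), min(focal)
        dist[hi] = dist.get(hi, 0.0) + p * m
        dist[lo] = dist.get(lo, 0.0) + (1.0 - p) * m
    return dist


def expected_value(dist):
    """Equation 1."""
    return sum(x * prob for x, prob in dist.items())


def interpolated_value(mass, p):
    """Equation 4: E_* + p * (E^* - E_*)."""
    lower = sum(min(f) * m for f, m in mass.items())
    upper = sum(max(f) * m for f, m in mass.items())
    return lower + p * (upper - lower)


# Hypothetical mass function with overlapping focal elements.
example_mass = {
    frozenset({1}): 0.3,
    frozenset({5, 10}): 0.2,
    frozenset({1, 10}): 0.1,
    frozenset({1, 5, 10, 20}): 0.4,
}

for p in (0.0, 0.25, 0.5, 1.0):
    direct = expected_value(induced_distribution(example_mass, p))
    shortcut = interpolated_value(example_mass, p)
    print(f"p = {p}: direct = {direct:.4f}, interpolated = {shortcut:.4f}")
```

For singleton focal elements the two updates in `induced_distribution` land on the same outcome and add up to the full mass, which is the singleton case of the proof.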

2.4  Discussion 

Because of its interval representation of belief, Shafer’s theory poses difficulties for a
decision-maker who uses it. Lesh has proposed a different method for choosing a distinguished point
to use in the ordering of overlapping choices [10]. Lesh makes use of an empirically derived
“ignorance preference coefficient,” r, that is used to compute the distinguished point called
“expected evidential belief” (EEB):

EEB(A)  =  +  yi<A)-Spt{A)Y 

A choice is made by choosing the action that maximizes the “expected evidential value” (EEV):

\[ \mathrm{EEV} = \sum_{A_i \subseteq \Theta} A_i \cdot \mathrm{EEB}(A_i) . \]

There are some important differences between Lesh’s approach and the present approach
for evidential decision-making. The ignorance preference parameter r can be seen as a
means for interpolating a distinguished value within a belief interval \([\mathrm{Spt}(A), \mathrm{Pls}(A)]\), while
the cooperation probability, p, is used to interpolate within an interval of expected utilities
\([E_*(x), E^*(x)]\). Secondly, Lesh’s parameter r is empirically derived and has no theoretical
underpinning. In contrast, the cooperation parameter p has been explained as a probability
of a comprehensible event: that the residual ambiguity will be favorably resolved. It leads
to a simple procedure involving linear interpolation between bounds of expected utility, and
is derived from probability theory.

The use of a single parameter to choose a value between two extremes is similar in spirit to
the approach taken by Hurwicz with a probabilistic formulation [6]. Hurwicz suggested that
rather than computing the expected utility of a variable for which a probability distribution
is known, one could interpolate a decision index between two extremes by estimating a single
parameter related to the disposition of nature. When this parameter is zero, one obtains
the Wald minimax criterion, the assumption that nature will act as strongly as possible
against the decision-maker [22]. In contrast to the Hurwicz approach in which one ignores
the probability distribution and computes a decision index on the basis of the parameter
only, in our approach the expected utility interval is computed, and interpolation between
extremes occurs only within the range of residual ambiguity allowed by the focal elements of
a belief function. Thus our approach is identical to the use of expected utilities when a
probability distribution is available; it is identical to Hurwicz’s approach when there are no known
constraints on the distribution; and it combines elements of both when the distribution is a
belief function.

There may be circumstances in which a single parameter is insufficient to capture the
underlying structure of a decision problem. In these cases it would be more appropriate to use
a different probability to represent the attitude of nature for each source of ambiguity. Let
\(p_i\) be the probability that ambiguity within each focal element \(A_i\) will be decided favorably,
\((\forall A_i)\, A_i \subseteq \Theta\). Then we obtain

\[ E(x) = \sum_{A_i \subseteq \Theta} \inf(A_i) \cdot m_\Theta(A_i) + \sum_{A_i \subseteq \Theta} p_i \, [\sup(A_i) - \inf(A_i)] \cdot m_\Theta(A_i) \tag{5} \]

in place of Equation 4.
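
Equation 5 differs from Equation 4 only in that each focal element carries its own probability. A minimal sketch follows; the mass function and the values in `p_by_focal` are hypothetical, chosen only to illustrate the formula.

```python
def expected_value_per_focal(mass, p_by_focal):
    """Equation 5: interpolate within each focal element A_i using its own
    probability p_i of a favorable resolution."""
    total = 0.0
    for focal, m in mass.items():
        lo, hi = min(focal), max(focal)
        # Singletons carry no ambiguity, so they need no p_i.
        p_i = p_by_focal[focal] if hi != lo else 0.0
        total += m * (lo + p_i * (hi - lo))
    return total


# Hypothetical mass function over dollar outcomes.
mass = {
    frozenset({1}): 0.5,
    frozenset({1, 10}): 0.3,
    frozenset({5, 10}): 0.2,
}

# One source of ambiguity is assumed adversarial, the other cooperative.
mixed = expected_value_per_focal(
    mass, {frozenset({1, 10}): 0.0, frozenset({5, 10}): 1.0}
)

# With every p_i equal to the same p, Equation 5 reduces to Equation 4.
uniform = expected_value_per_focal(
    mass, {frozenset({1, 10}): 0.5, frozenset({5, 10}): 0.5}
)

print(f"mixed = {mixed:.2f}, uniform p = 0.5: {uniform:.2f}")
```

In the uniform case the result equals the midpoint of the expected value interval [1.8, 5.5] of this hypothetical mass function.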
3  Decision Analysis

In the preceding section we have defined the concept of an expected utility interval for belief
functions and we have shown that it bounds the expected utility that would be obtained
with any probability distribution consistent with that belief function. Furthermore, we have
proposed a parameter (the probability that residual ambiguity will be decided in our favor)
that can be used as the basis for computing a unique expected utility when the available
evidence warrants only bounds on that expected utility. In this section we will show how
the expected utility interval can be used to generalize probabilistic decision analysis.

Decision analysis was first developed as a means by which one could organize and
systematize one’s thinking when confronted with an important and difficult choice [4, 14]. Its
formal basis has made it adaptable as a computational procedure by which computer
programs can choose actions when provided with all relevant information. Simply stated, the
analysis of a decision problem under uncertainty entails the following steps:

•  List  the  viable  options  available  for  gathering  information,  for  experimentation,  and 
for  action. 

•  List  the  events  that  may  possibly  occur. 

•  Arrange the information you may acquire and the choices you may make in
chronological order.

•  Decide  the  value  to  you  of  the  consequences  that  result  from  the  various  courses  of 
action  open  to  you. 

•  Judge  the  chances  that  any  particular  uncertain  event  will  occur. 

3.1  Decision  analysis  using  probabilities 

First  we  will  illustrate  the  use  of  decision  analysis  on  a  problem  that  can  be  represented  with 
probabilities  to  acquaint  the  reader  with  the  method  and  terminology. 

Example  -  Oil  Drilling  A  wildcatter  must  decide  whether  or  not  to  drill  for 
oil.  He  is  uncertain  whether  the  hole  will  be  dry,  have  a  trickle  of  oil,  or  be  a  gusher. 
Drilling  a  hole  costs  $70,000.  The  payoffs  for  hitting  a  gusher,  a  trickle,  or  a  dry  hole 
are  $270,000,  $120,000,  and  $0,  respectively.  At  a  cost  of  $10,000  the  wildcatter  could 
take  seismic  soundings  that  would  help  determine  the  underlying  geologic  structure. 

The  soundings  will  determine  whether  the  terrain  has  no  structure,  open  structure, 
or  closed  structure.  The  experts  have  provided  us  with  the  joint  probabilities  shown 
below.  We  are  to  determine  the  optimal  strategy  for  experimentation  and  action  [8]. 


    State       No struct   Open    Closed   Marginal
    Dry           0.30      0.15     0.05      0.50
    Trickle       0.09      0.12     0.09      0.30
    Gusher        0.02      0.08     0.10      0.20
    Marginal      0.41      0.35     0.24      1.00
In decision analysis, a decision tree is constructed that captures the chronological order
of actions and events [8, 9]. A square is used to represent a decision to be made, and its
branches are labeled with the alternative choices. A circle is used to represent a chance node,
and its branches are labeled with the conditional probability of each event, given that the
choices and events along the path leading to the node have occurred.

To compute the best strategy, the tree is evaluated from its leaves toward its root.

•  The  value  of  a  leaf  node  is  the  utility  of  the  state  of  nature  it  represents. 

•  The  value  of  a  chance  node  is  the  expected  utility  of  the  probability  distribution  rep¬ 
resented  by  its  branches  as  computed  using  Equation  1. 

•  The  value  of  a  choice  node  is  the  maximum  of  the  utilities  of  each  of  its  sons.  The 
best  choice  for  the  node  is  denoted  by  the  branch  leading  to  the  son  with  the  greatest 
utility.  Ties  are  broken  arbitrarily. 

This  procedure  is  repeated  until  the  root  node  has  been  evaluated.  The  value  of  the  root 
node  is  the  expected  utility  of  the  decision  problem;  the  branches  corresponding  to  the 
maximal value at each choice node give the best strategy to follow (i.e. choices to make in
each  situation). 

The evaluated decision tree for the oil drilling example is portrayed in Figure 4. It can
be seen that the expected value is $22,500 and that the best strategy is to take seismic
soundings, to drill for oil if the soundings indicate open or closed structure, and not to drill
if the soundings indicate no structure.
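
The leaf-to-root evaluation can be sketched in code for this example. The sketch is illustrative, not the paper's implementation: all names are our own, and the joint probabilities in `JOINT` are an assumption of this sketch, chosen to agree with the marginals quoted in the table (0.41, 0.35, 0.24 for the soundings and 0.50, 0.30, 0.20 for the well capacity). Leaf utilities are net payoffs, that is, the payoff minus the drilling cost and any test cost already paid.

```python
DRILL_COST = 70_000
TEST_COST = 10_000
PAYOFF = {"dry": 0, "trickle": 120_000, "gusher": 270_000}

# Assumed joint probabilities p(structure, capacity); see the note above.
JOINT = {
    "none":   {"dry": 0.30, "trickle": 0.09, "gusher": 0.02},
    "open":   {"dry": 0.15, "trickle": 0.12, "gusher": 0.08},
    "closed": {"dry": 0.05, "trickle": 0.09, "gusher": 0.10},
}


def drill_value(state_probs, sunk_cost):
    """Chance node: expected net payoff of drilling (Equation 1)."""
    return sum(
        p * (PAYOFF[state] - DRILL_COST - sunk_cost)
        for state, p in state_probs.items()
    )


def best_action(state_probs, sunk_cost):
    """Choice node: take the branch with the greater expected utility."""
    options = {
        "drill": drill_value(state_probs, sunk_cost),
        "don't drill": -sunk_cost,
    }
    action = max(options, key=options.get)
    return action, options[action]


# Branch without soundings: use the marginal distribution of capacity.
prior = {s: sum(JOINT[t][s] for t in JOINT) for s in PAYOFF}
no_test_action, no_test_value = best_action(prior, sunk_cost=0)

# Branch with soundings: condition on each result, then average.
strategy = {}
test_value = 0.0
for structure, row in JOINT.items():
    p_structure = sum(row.values())
    posterior = {s: p / p_structure for s, p in row.items()}
    action, value = best_action(posterior, sunk_cost=TEST_COST)
    strategy[structure] = action
    test_value += p_structure * value

root_action = "test" if test_value > no_test_value else "no test"
root_value = max(test_value, no_test_value)

print(f"root: {root_action}, expected value = {root_value:,.0f}")
print(strategy)
```

Under these assumed probabilities the sketch reproduces the strategy stated in the text; drilling without soundings is worth $20,000 in expectation, so the soundings are worth taking.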

3.2  Decision  analysis  using  belief  functions 

To use the decision procedure just described, it must be possible to assess the probabilities
of all uncertain events. That is, the set of branches emanating from each chance node in the
decision tree must depict a probability distribution. In many scenarios, however, estimating
these probability distributions is difficult or impossible, and the decision-maker is forced to
assign probabilities even though he knows they are unreliable. Using belief functions, one
need not estimate any probabilities that are not readily available. The representation better
reflects the evidence at hand, but the decision analysis procedure cannot be used with the
resulting interval representation of belief. In this section we describe a generalization of
decision analysis that accommodates belief functions.

Example - Oil Drilling #2  As in the first oil-drilling example, a wildcatter
must decide whether or not to drill for oil. His costs and payoffs are the same as before:
drilling costs $70,000, and the payoffs for hitting a gusher, a trickle, or a dry well are
$270,000, $120,000, and $0, respectively. However, at this site, no seismic soundings
are available. Instead, at a cost of $10,000, the wildcatter can make an electronic test
that is related to the well capacity as shown below. We are to determine the optimal
strategy for experimentation and action.
Prob    Test result    Capacity
0.5     red            dry
0.2     yellow         dry or trickle
0.3     green          trickle or gusher

Several  issues  arise  that  prevent  one  from  constructing  a  well-formed  decision  tree  for 
this  example.  First,  consider  the  branch  of  the  tree  in  which  the  test  is  conducted  and  the 
result  is  green  (Figure  5).  If  we  drill  for  oil,  then  we  know  we  will  find  either  a  trickle  or  a 
gusher,  but  we  cannot  determine  the  probability  of  either  from  the  given  information.  We 
are tempted to label the branch with the disjunction (Trickle ∨ Gusher) with probability 1.0.
But what should be the payoff of that branch? All we can say is that the payoff will be either
$40,000 (if a trickle) or $190,000 (if a gusher). Ordinary decision analysis requires a unique
value to be assigned, but we have no basis for computing one. So the first modification
we make to the construction of decision trees is to allow disjunctions of events on branches
emanating from chance nodes, and to allow intervals as the payoffs for leaf nodes. We will
discuss  later  how  to  evaluate  such  a  tree. 

To see the second issue, consider the branch of the tree in which the test is not conducted.
If we drill for oil, there is a chance that we will hit a gusher, a trickle, or a dry well, but
what is the probability distribution? We know only that


p(Dry | Red) = 1.0                   p(Red) = 0.5

p(Dry ∨ Trickle | Yellow) = 1.0      p(Yellow) = 0.2

p(Trickle ∨ Gusher | Green) = 1.0    p(Green) = 0.3


There is not enough information to use Bayes’ rule to compute the probability distribution
for the well capacity. Without adding a new assumption at this point, the strongest statement
that can be made is

0.5 ≤ p(Dry) ≤ 0.7

0.0 ≤ p(Trickle) ≤ 0.5

0.0 ≤ p(Gusher) ≤ 0.3 .

Using  belief  functions,  this  can  be  represented  as 

m({Dry})  =  0.5 
m({Dry,  Trickle})  =  0.2 
m({Trickle,  Gusher})  =  0.3, 

which  yields  the  required  belief  intervals 


[Spt({Dry}), Pls({Dry})] = [0.5, 0.7]
[Spt({Trickle}), Pls({Trickle})] = [0.0, 0.5]
[Spt({Gusher}), Pls({Gusher})] = [0.0, 0.3]
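The belief intervals above can be checked mechanically. The following is a minimal sketch, not the paper's implementation: the names `MASS`, `support`, `plausibility`, and `belief_interval` are illustrative, and plausibility is computed as the total mass of focal elements that intersect the event, which equals one minus the support of the complement when the masses sum to one.

```python
# Mass function from the second oil-drilling example.
# Focal elements are frozensets of well capacities (illustrative encoding).
MASS = {
    frozenset({"Dry"}): 0.5,
    frozenset({"Dry", "Trickle"}): 0.2,
    frozenset({"Trickle", "Gusher"}): 0.3,
}


def support(mass, event):
    """Spt(A): total mass of the focal elements contained in A."""
    return sum(m for focal, m in mass.items() if focal <= event)


def plausibility(mass, event):
    """Pls(A): total mass of the focal elements that intersect A."""
    return sum(m for focal, m in mass.items() if focal & event)


def belief_interval(mass, event):
    """Return the pair (Spt(A), Pls(A)) for an event given as an iterable."""
    event = frozenset(event)
    return support(mass, event), plausibility(mass, event)


if __name__ == "__main__":
    for state in ("Dry", "Trickle", "Gusher"):
        lo, hi = belief_interval(MASS, {state})
        print(f"{state}: [{lo:.1f}, {hi:.1f}]")
```

Representing focal elements as sets keeps the overlap between {Dry, Trickle} and {Trickle, Gusher} explicit, which a probability table over single states cannot do.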


The second modification we make to decision trees is to allow the branches emanating from a
chance node to represent a mass function. The masses must still sum to one, but the events
need not be disjoint.^ The completed decision tree for the second oil-drilling example is shown in
Figure 5.

Figure 5: Modified Decision Tree for the Second Oil Drilling Example

The  tools  of  Section  2  can  be  used  to  evaluate  a  decision  tree  modified  in  this  manner. 

•  The  value  of  a  leaf  node  is  the  utility  of  the  state  of  nature  it  represents.  This  may  be 
a  unique  value  or,  in  the  case  of  a  disjunction  of  states,  an  interval  of  values. 

• A chance node represents a belief function. Its value is the expected utility interval
computed with Equation 3:

$$E(x) = [E_*(x),\ E^*(x)].$$

• A decision node represents a choice of the several branches emanating from it. The
utility of each branch may be a point value or an interval. The value of a decision node
is the expected utility computed using Equation 4 and an estimate of p:

$$E(x) = E_*(x) + p \cdot (E^*(x) - E_*(x)).$$

The  action  on  the  branch  that  yields  the  greatest  E{x)  is  chosen.  Ties  are  broken 
arbitrarily. 
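The three evaluation rules can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the paper's software: the function names are hypothetical, every leaf is written as a `(low, high)` pair (a point value has `low == high`), and the example uses the no-test "Drill" branch, whose net payoffs are the stated payoffs minus the $70,000 drilling cost.

```python
def expected_utility_interval(branches):
    """Chance node: mass-weighted sum of the lower and upper leaf values.

    branches is a list of (mass, (low, high)) pairs.
    """
    lower = sum(m * lo for m, (lo, hi) in branches)
    upper = sum(m * hi for m, (lo, hi) in branches)
    return lower, upper


def evidential_expected_value(interval, p):
    """Interpolate within the interval using the assumed probability p."""
    lo, hi = interval
    return lo + p * (hi - lo)


def decide(options, p):
    """Decision node: name of the option with the greatest E(x)."""
    return max(options, key=lambda name: evidential_expected_value(options[name], p))


# Net payoffs when drilling without the test (payoff minus 70,000).
DRY, TRICKLE, GUSHER = -70_000, 50_000, 200_000
drill_without_test = [
    (0.5, (DRY, DRY)),          # m({Dry})
    (0.2, (DRY, TRICKLE)),      # m({Dry, Trickle})
    (0.3, (TRICKLE, GUSHER)),   # m({Trickle, Gusher})
]

eui = expected_utility_interval(drill_without_test)
options = {"No Drill": (0, 0), "Drill": eui}

if __name__ == "__main__":
    print("EUI of drilling:", eui)
    for p in (0.0, 0.5, 1.0):
        print(p, decide(options, p), evidential_expected_value(eui, p))
```

Keeping the interval and the interpolation in separate functions mirrors the separation the text draws between what the evidence supports and the added assumption p.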

In summary, a decision tree and decision analysis procedure for belief functions have
been described. Two modifications were made to adapt ordinary decision trees: intervals are
allowed where utilities occur, and belief functions are allowed where probability distributions
occur. A unique strategy can be obtained by estimating the probability p.

3.3  Generalized  decision  tree  examples 

Figures 6, 7, and 8 show the evaluated decision tree for several values of p: each node is
labeled with its expected value or expected value interval. In the cases where the expected
value is an interval, the evidential expected value E(x) is also shown (using the assumed p).
Preferred decisions are highlighted with a black background.

If we opt not to test, then our choice is either to not drill (expected value 0) or to drill
(expected value interval [-34,000  35,000]). The better choice depends on what value of
p is assumed. As can be seen in the figures, if p = 0.0, then it is better to not drill, but if
p = 0.5 or p = 1.0, then drilling is the better choice.

If we choose to test and the result is yellow, then our choice is to not drill (expected value
-10,000) or to drill (expected value interval [-80,000  40,000]). In this case it is better to
not drill if either p = 0.0 or p = 0.5 and to drill if p = 1.0.

If  the  test  result  is  red,  then  one  should  not  drill  regardless  of  p  (-10,000  is  always  better 
than  -80,000).  If  the  test  result  is  green,  then  one  should  always  drill  (-10,000  is  never  as 
good  as  the  interval  [40,000  190,000]  ). 
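The case analysis above can be reproduced by evaluating the whole tree for a given p. The sketch below is an assumption-laden illustration, not the software used to draw the figures: `eev`, `chance`, `choose`, and `evaluate` are hypothetical names, and each decision node passes the interval of its chosen branch up to its parent, which by linearity gives the same expected value as passing the point value.

```python
def eev(interval, p):
    """Evidential expected value of a (low, high) interval."""
    lo, hi = interval
    return lo + p * (hi - lo)


def chance(branches):
    """Mass-weighted sum of child intervals; branches are (mass, interval)."""
    return (sum(m * lo for m, (lo, hi) in branches),
            sum(m * hi for m, (lo, hi) in branches))


def choose(options, p):
    """Return (action, interval) for the option with the greatest E(x)."""
    return max(options.items(), key=lambda item: eev(item[1], p))


def evaluate(p):
    """Evaluate the second oil-drilling tree for an assumed p."""
    no_drill_after_test = (-10_000, -10_000)   # test cost only
    per_result = {
        "red": choose({"No Drill": no_drill_after_test,
                       "Drill": (-80_000, -80_000)}, p),
        "yellow": choose({"No Drill": no_drill_after_test,
                          "Drill": (-80_000, 40_000)}, p),
        "green": choose({"No Drill": no_drill_after_test,
                         "Drill": (40_000, 190_000)}, p),
    }
    masses = {"red": 0.5, "yellow": 0.2, "green": 0.3}
    test = chance([(masses[r], per_result[r][1]) for r in masses])
    no_test = choose({"No Drill": (0, 0), "Drill": (-34_000, 35_000)}, p)
    top = choose({"Test": test, "No Test": no_test[1]}, p)
    result = {"test?": top[0], "value": eev(top[1], p), "no_test": no_test[0]}
    result.update({r: per_result[r][0] for r in per_result})
    return result


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, evaluate(p))
```

Because only `p` changes between runs, the intervals at the leaves and the masses on the chance node are reused unchanged for every sensitivity check.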

^Recall that a probability distribution is an assignment of belief over mutually exclusive elements of a
set, whereas a mass function is a distribution over possibly overlapping subsets.

^When all utilities are point-valued and all belief functions are true probability distributions, no assumption
is required and the strategy will be identical to that prescribed by ordinary decision analysis.
Figure 7: Decision Tree for the Second Oil Drilling Example (assuming p = 0.5).


3.4  Comparing  two  choices 

Instead  of  assuming  a  value  for  p  first,  and  calculating  the  choices  that  result,  one  may  ask 
the  reverse  question.  At  what  value  of  p  would  I  change  my  decision?  This  can  be  answered 
in  general  by  examining  a  choice  between  two  states  having  expected  utility  intervals. 

Theorem 2  Let the expected utility intervals of two choices be as follows:

Choice 1: $[E_{1*}(x),\ E_1^*(x)]$

Choice 2: $[E_{2*}(x),\ E_2^*(x)]$.

Assume without loss of generality that Choice 1 has the smaller interval, i.e.
$(E_2^*(x) - E_{2*}(x)) > (E_1^*(x) - E_{1*}(x))$. Then Choice 2 is preferred over Choice 1 iff

$$p > \frac{E_{1*}(x) - E_{2*}(x)}{E_2^*(x) - E_1^*(x) + E_{1*}(x) - E_{2*}(x)}.$$


Proof:

Using Theorem 1 and solving for p gives the point $p_c$ at which one is indifferent between
Choice 1 and Choice 2:

$$E_1(x) = E_{1*}(x) + p \cdot (E_1^*(x) - E_{1*}(x))$$

$$E_2(x) = E_{2*}(x) + p \cdot (E_2^*(x) - E_{2*}(x))$$

$$p_c = \frac{E_{1*}(x) - E_{2*}(x)}{(E_2^*(x) - E_1^*(x)) + (E_{1*}(x) - E_{2*}(x))}. \tag{7}$$

The expected value of both choices at $p_c$ is

$$E_c(x) = \frac{E_{1*}(x) \cdot E_2^*(x) - E_1^*(x) \cdot E_{2*}(x)}{E_2^*(x) - E_{2*}(x) + E_{1*}(x) - E_1^*(x)}. \tag{8}$$

Now consider the choice at $p = p_c + \delta$ where $\delta > 0$:

$$E_1(x) = E_c(x) + \delta \cdot (E_1^*(x) - E_{1*}(x))$$

$$E_2(x) = E_c(x) + \delta \cdot (E_2^*(x) - E_{2*}(x)). \tag{9}$$

Since $(E_2^*(x) - E_{2*}(x)) > (E_1^*(x) - E_{1*}(x))$ and $\delta > 0$, it must be the case that $E_2(x) > E_1(x)$.
Therefore, Choice 2 is preferred. A similar argument shows that Choice 1 is preferred whenever
$p < p_c$.

□

Letting

$$a = E_{1*}(x) - E_{2*}(x) \quad \text{and} \quad b = E_2^*(x) - E_1^*(x)$$

gives

$$p_c = \frac{a}{a+b}. \tag{10}$$
Thus, Choice 1 is preferable if

$$p < \frac{a}{a+b}$$

and Choice 2 is preferable if

$$p > \frac{a}{a+b}.$$

If $\frac{a}{a+b} > 1.0$ then Choice 1 is always preferred (no assumption of p is necessary). If $\frac{a}{a+b} < 0.0$
then Choice 2 is always preferred. It follows that whenever one EUI is slightly “higher” than
another, i.e.

$$E_{1*}(x) > E_{2*}(x) \ \text{ and } \ E_1^*(x) > E_2^*(x),$$

then the action that gives rise to it is always preferred.

Returning to the second oil-drilling example (Figure 5), the decision of whether or not
to drill when the test result is yellow involves a choice between

No Drill: E(x) = [-10,000  -10,000]

Drill:    E(x) = [-80,000  40,000].

By Theorem 2 (with a = 70,000 and b = 50,000), $p_c$ = 0.583, and one should drill only if p > 0.583.

When p > 0.583, the decision as to whether or not to conduct the test involves a choice
between

No Test: E(x) = [-34,000  35,000]

Test:    E(x) = [ -9,000  60,000].

Here, Test is the preferred choice because its EUI is higher.
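The indifference point of Theorem 2 is easy to compute directly. The following is a small sketch with hypothetical function names; it assumes intervals are given as `(low, high)` pairs and that the caller passes the narrower interval first, as the theorem requires. The separate `dominates` check covers the Test versus No Test comparison, where the two intervals have equal width and the theorem's strict inequality does not apply.

```python
def indifference_point(narrow, wide):
    """Return p_c = a / (a + b) for two expected utility intervals."""
    (lo1, hi1), (lo2, hi2) = narrow, wide
    if (hi2 - lo2) <= (hi1 - lo1):
        raise ValueError("second interval must be strictly wider than the first")
    a = lo1 - lo2   # a = E_{1*} - E_{2*}
    b = hi2 - hi1   # b = E_2^* - E_1^*
    return a / (a + b)


def preferred(narrow, wide, p):
    """Name the preferred interval ('narrow' or 'wide') for an assumed p."""
    pc = indifference_point(narrow, wide)
    if p > pc:
        return "wide"
    if p < pc:
        return "narrow"
    return "indifferent"


def dominates(x, y):
    """True when interval x is at least as high as y at both ends."""
    return x[0] >= y[0] and x[1] >= y[1]


if __name__ == "__main__":
    no_drill, drill = (-10_000, -10_000), (-80_000, 40_000)
    print(round(indifference_point(no_drill, drill), 3))
    print(preferred(no_drill, drill, 0.5), preferred(no_drill, drill, 0.6))
    print(dominates((-9_000, 60_000), (-34_000, 35_000)))
```

For the yellow test result, the narrow interval is No Drill and the wide one is Drill, so a result of "wide" means drilling is preferred.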


4  Discussion 

The value of the result of an action is frequently measured in money (e.g., in dollars), but
people often exhibit preferences that are not consistent with maximization of expected monetary
value. The theory of utility accounts for this behavior by associating, for an individual
decision-maker, a value (measured in utiles) with each state s, u = f(s), such that maximization
of expected utility yields choices consistent with that individual’s behavior [4].
Utility theory can satisfactorily account for a person’s willingness to expose himself to risk
and should be used whenever one’s preferences are not linearly related to value. This attitude
toward risk should not be confused with one’s attitude towards ambiguity, which is the
quality that is modeled by p.

4.1  On  making  assumptions 

It is interesting to compare the types of assumptions made in a probabilistic analysis with the
p assumption proposed here for belief functions. When using probability, a maximum entropy
assumption is often made. Sometimes, this assumption is justified, and it should properly be
considered part of the evidence, not an assumption. When this is the case, a maximum
entropy belief function can be used as well [2]. At other times, the maximum entropy
assumption is not justified, but is used simply because some assumption must be made, and
maximum entropy has some desirable properties [20]. In these cases, the choice of elements
in the sample space (the set of possibilities) introduces distortion into the expected value
that will result. That is, adding a few more possibilities into the sample space will change the
expected value of the maximum entropy distribution over that sample space. For example,
if we choose to allow for the possibility of $2 being among the possibilities for the hidden
sector of Carnival Wheel #2, the sample space would be {1,2,5,10,20} instead of {1,5,10,20},
and the expected value of the maximum entropy distribution of that wheel would be $6.16
instead of $6.30. On the other hand, for any choice of p, the evidential expected value using
either of the two preceding sample spaces would be identically (5.50 + 1.90p) dollars. Of
course, adding possibilities outside the interval [1, 20] would change the evidential expected
value. For example, allowing for the possibility of $50 in the hidden sector would change the
maximum entropy expected value to $7.12 and would change the evidential expected value
to (5.50 + 4.90p) dollars. The point is that both assumptions introduce bias into the decision
criteria. This should not be surprising because both are unjustified assumptions. There is
no basis on which to prefer one over the other; both assumptions are entirely plausible.
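The sensitivity of the two assumptions to the choice of sample space can be illustrated numerically. The sketch below rests on two figures that are not stated in this section and are inferred from the dollar amounts quoted above: that the visible sectors of the wheel contribute $5.40 to the expected value and that the hidden sector carries mass 0.1. The maximum entropy assumption is modeled as a uniform distribution over the hidden sector's candidate payoffs; all names are illustrative.

```python
VISIBLE = 5.40      # assumed contribution of the visible sectors, in dollars
HIDDEN_MASS = 0.1   # assumed mass attached to the hidden sector


def max_entropy_value(possibilities):
    """Expected value when the hidden sector is uniform over the payoffs."""
    return VISIBLE + HIDDEN_MASS * sum(possibilities) / len(possibilities)


def evidential_value(possibilities, p):
    """Evidential expected value: only the extreme payoffs matter."""
    lo, hi = min(possibilities), max(possibilities)
    return VISIBLE + HIDDEN_MASS * (lo + p * (hi - lo))


if __name__ == "__main__":
    for space in ([1, 5, 10, 20], [1, 2, 5, 10, 20], [1, 5, 10, 20, 50]):
        print(space,
              round(max_entropy_value(space), 2),
              [round(evidential_value(space, p), 2) for p in (0.0, 0.5, 1.0)])
```

Adding $2 moves the maximum entropy value but leaves the evidential value untouched, because $2 lies inside the interval [1, 20]; adding $50 moves both.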

Having made this point, there are some (consequently weak) arguments for recommending
the use of the assumption of the probability of nature’s cooperation p. Because the EUI
spans the range of all expected utilities that could be obtained by adding any assumption
to a probabilistic analysis, there always exists some value of p, 0 ≤ p ≤ 1, that yields the
same expected utility as a probabilistic analysis. Therefore, the decisions that are
prescribed depend only on one’s ability to estimate p, not on his election to use Equation 3.
Furthermore, the use of a single parameter means that the decision-maker is asked to provide
only one additional piece of information.

The  parameter  p  has  been  explained  as  a  probability,  giving  it  a  formal  grounding  that 
earlier  decision  schemes  for  belief  functions  have  lacked.  Furthermore,  we  believe  that  it  is 
the  probability  of  a  meaningful  event.  Selecting  p  =  0  is  appropriate  when  an  adversary 
controls  the  situation  (as  in  game  playing,  for  example)  or  when  a  decision-maker  wishes 
only to minimize his expected loss, and is equivalent to the maximin criterion of Wald. An
optimistic  decision-maker  would  prefer  to  choose  p  =  1  to  maximize  his  chance  of  realizing 
the  greatest  possible  expected  payoff  without  worrying  about  what  losses  might  be  possible. 
Intermediate  values  of  p  can  be  used  to  compromise  between  these  extremes. 

4.2  On  the  limitations  of  the  approach 

Despite the appeal of a computationally efficient decision analysis procedure for belief functions,
there remain some issues that are not addressed. As in classical decision analysis, it
remains necessary to enumerate the potential states of nature and to assign utilities (actually
utility intervals, which should be easier to assign in practice). This task can be overwhelming
when complex scenarios are considered. Furthermore, it should not be forgotten that the
assignment of a value to p (when it is necessary) remains an assumption unwarranted by the
evidence at hand, just as maximum entropy or any other assumption is unwarranted when
insufficient information is available.

It  is  inherent  in  the  methodology  described  that  the  determination  of  what  is  best  or 
worst  is  considered  after  the  decision-maker’s  choice  is  postulated.  That  is,  the  reaction  of 
nature  is  allowed  to  depend  on  the  decision  that  is  to  be  taken.  This  is  sometimes  reasonable, 
and sometimes not. For example, conducting a regional test market for a new product may
affect the national demand by virtue of publicity or increased competition. As a result, there
may  be  no  single  underlying  probability  distribution  that  can  simultaneously  give  rise  to  the 
expected  utilities  obtained  for  each  choice.  This  should  not  be  particularly  worrisome  as 
long  as  this  consideration  suits  the  problem  at  hand.  If  not,  the  expected  utility  intervals 
computed  with  the  method  described  here  may  be  wider  than  prescribed  by  the  evidence.  In 
that  case,  it  is  necessary  to  conduct  a  more  complicated  case-based  analysis  that  is  analogous 
to  the  linear-programming  problems  that  arise  in  game  theory.  See  Jaffray  [7]  for  further 
discussion  of  this  approach. 

4.3  On  the  automation  of  decision  analysis 

A  probabilistic  analysis  of  a  decision  problem  (e.g.  the  second  oil-drilling  example)  follows 
the  paradigm:  assess,  assume,  combine,  decide.  An  assessment  of  a  probability  distribution 
is  made  for  each  piece  of  evidence;  assumptions  are  made  about  the  distributions  of  missing 
pieces of evidence; the assumptions and evidence are combined to obtain a distribution
of  payoffs,  and  a  decision  is  made  on  the  basis  of  the  expected  utility  of  the  payoff.  In 
contrast,  a  belief  function  analysis  follows  the  paradigm:  assess,  combine,  assume,  decide. 
An  assessment  of  a  belief  function  is  made  for  each  piece  of  evidence;  these  pieces  of  evidence 
are then combined to obtain a belief function over the possible payoffs; then an assumption
is  made  (about  the  benevolence  of  nature);  and  a  decision  is  made  using  that  assumption 
and  the  expected  utility  interval  of  the  payoffs. 

While the same decisions will be reached whether one makes assumptions first and then
combines or combines evidence and then adds those assumptions, the difference in paradigms
has important implications for automating the procedure. First, in some decision problems
the EUI of the top choice will not overlap the EUI of any other choice, i.e. the decision follows
from what is truly known, and in no way depends upon the accuracy of any assumption
that might be made. Using belief functions, the best decision in this case is immediately
determinable without additional assumptions. Because Bayes’ rule requires a prior distribution,
this situation cannot be recognized without a more complex sensitivity analysis
when a purely probabilistic representation is used. Second, when an assumption must be
made because intervals do overlap, making it as late as possible allows one to maintain the
assumption-free intermediate calculations for use in other computations. This is not an issue
when the evidence will be used once and discarded, but affords a considerable computational
savings when other decisions must be based on the original evidence plus new evidence as
it comes along. Third, consider what must be computed if one chooses to use a different
assumption (as needed for sensitivity analysis, for example). In a probabilistic analysis the
assumptions and all evidence must be recombined before a decision can be made because the
assumptions are needed to combine the evidence. Using belief functions, one need only combine
the new assumption with the already combined evidence before selecting the decision.
This separation of evidence and assumption is similar in spirit to the distinction between
credal and pignistic beliefs described by Smets [19].
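The "assess, combine, assume, decide" ordering can be sketched as a function that asks for p only when the intervals force it. This is an illustrative sketch, not a description of any existing system: `decide_late` is a hypothetical name, the input is a mapping from choices to already combined expected utility intervals, and "does not overlap" is read as the top choice's lower bound being at least every other choice's upper bound.

```python
def decide_late(euis, p=None):
    """Return (choice, assumption_used) from combined utility intervals.

    euis maps each choice to its (low, high) expected utility interval.
    p is consulted only when the top interval overlaps another one.
    """
    top = max(euis, key=lambda c: euis[c][0])
    others_upper = [hi for c, (lo, hi) in euis.items() if c != top]
    if all(euis[top][0] >= hi for hi in others_upper):
        return top, False            # decided by the evidence alone
    if p is None:
        raise ValueError("intervals overlap: a value of p is required")
    best = max(euis, key=lambda c: euis[c][0] + p * (euis[c][1] - euis[c][0]))
    return best, True


if __name__ == "__main__":
    green = {"No Drill": (-10_000, -10_000), "Drill": (40_000, 190_000)}
    yellow = {"No Drill": (-10_000, -10_000), "Drill": (-80_000, 40_000)}
    print(decide_late(green))             # no assumption needed
    for p in (0.5, 1.0):                  # sensitivity check reuses the intervals
        print(p, decide_late(yellow, p))
```

Trying a different p only repeats the final comparison; the intervals themselves, which encode the combined evidence, are never recomputed.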


5  Summary 

We  have  proposed  a  decision  analysis  methodology  for  Shafer’s  theory  of  belief  functions. 
We started by defining the notion of expected utility interval (EUI) and showed it to properly
bound the expected utility of any probability distribution that could be obtained by
introducing additional assumptions. Because an expected utility interval is often insufficient
for decision-making, we recognize that a point value must be chosen to compare alternative
choices.  We  then  showed  how  a  linear  interpolation  of  a  distinguished  value  within  the  EUI 
is  equivalent  to  making  an  assumption  of  the  benevolence  or  maleficence  of  nature.  Letting 
p  be  the  probability  that  ambiguity  will  be  resolved  favorably,  we  derived  that  distinguished 
point. 

We  have  also  shown  how  the  theory  can  be  used  to  generalize  the  decision  trees  used 
in  probabilistic  decision  analysis.  These  tools  allow  a  decision-maker  to  defer  unwarranted 
assumptions  until  the  latest  possible  moment.  In  so  doing  he  can  sometimes  avoid  making 
any  assumptions  at  all.  Otherwise,  he  is  forced  to  provide  only  enough  additional  information 
to  allow  a  clear  choice,  and  has  the  benefit  of  all  available  information  to  selectively  decide 
where  he  would  like  to  make  that  assumption. 

We  have  implemented  the  techniques  and  have  used  that  software  to  generate  the  decision 
trees shown in the figures in this paper. In addition, a new evidential operator for decision-making
has been added to the repertoire of the evidential reasoning technology developed at
SRI  International  (see  Appendix  B).  Decision  analysis  has  been  incorporated  into  Gister,^ 
SRI’s  evidential  reasoning  system  which  uses  the  Dempster-Shafer  theory  of  belief  functions 
as  its  underlying  representation. 

What we have described is by no means a full theory of decision-making for belief functions.
Rather, we hope it may provide some insight that will someday lead to a better
understanding  of  decision-making  with  incomplete  information. 
A  Notation


$p_\Theta(x)$ - Probability distribution over sample space $\Theta$, $x \in \Theta$

$m_\Theta(A)$ - Mass function defined over frame of discernment $\Theta$, $A \subseteq \Theta$

$Spt(A)$ - Support: $Spt(A) = \sum_{A_i \subseteq A} m_\Theta(A_i)$

$Pls(A)$ - Plausibility: $Pls(A) = 1 - Spt(\Theta - A)$

$E(x)$ - Expected value of a random variable whose outcome is
governed by a probability distribution:
$E(x) = \sum_{x \in \Theta} x \cdot p_\Theta(x)$

$E(x)$ - Evidential expected value: the expected value of a variable
governed by a belief function assuming that any
residual ambiguity will be decided favorably with probability p:
$E(x) = (1 - p) \cdot E_*(x) + p \cdot E^*(x)$

$E^*(x)$ - Upper bound of expected value:
$E^*(x) = \sum_{A_i \subseteq \Theta} \sup(A_i) \cdot m_\Theta(A_i)$

$E_*(x)$ - Lower bound of expected value:
$E_*(x) = \sum_{A_i \subseteq \Theta} \inf(A_i) \cdot m_\Theta(A_i)$

EVI - Expected value interval: $[E_*(x),\ E^*(x)]$

EUI - Expected utility interval: Same as EVI, when $\Theta$ is a
frame of utilities

p - The probability that any residual ambiguity will be decided favorably

1 - p - The probability that any residual ambiguity will be decided unfavorably

$p_c$ - The value of p at which one would be indifferent between
two choices
B  Decision-Making with Evidential Reasoning

In this section we reanalyze the second oil-drilling example within the framework of evidential
reasoning. First we review some of the tools of evidential reasoning and then introduce a
decision operator for belief functions based on the theory described earlier.

In evidential reasoning, domain-specific knowledge is defined in terms of compatibility
relations that relate one frame of discernment to another. A compatibility relation simply
describes which elements from the two frames can simultaneously be true. A compatibility
relation $\Theta_{A,B}$ between two frames $\Theta_A$ and $\Theta_B$ is a set of pairs such that

$$\Theta_{A,B} \subseteq \Theta_A \times \Theta_B,$$

where every element of $\Theta_A$ and every element of $\Theta_B$ is included in at least one pair.

Evidential reasoning provides a number of formal operations for assessing evidence, including:

• Fusion: to determine a consensus from several bodies of evidence obtained from
independent sources. Fusion is accomplished through Dempster’s rule of combination:

$$m_\Theta^3(A_k) = \frac{1}{1 - \kappa} \sum_{A_i \cap A_j = A_k} m_\Theta^1(A_i)\, m_\Theta^2(A_j), \qquad \kappa = \sum_{A_i \cap A_j = \emptyset} m_\Theta^1(A_i)\, m_\Theta^2(A_j). \tag{11}$$

Dempster’s Rule is both commutative and associative (meaning evidence can be fused
in any order) and has the effect of focusing belief on those propositions that are held
in common.

• Translation: to determine the impact of a body of evidence upon elements of a
related frame of discernment. The translation of a belief function from frame $\Theta_A$ to
frame $\Theta_B$ using the compatibility relation $\Theta_{A,B}$ is defined by

$$m_{\Theta_B}(B_j) = \sum_{C_{A \mapsto B}(A_k) = B_j} m_{\Theta_A}(A_k), \qquad A_k \subseteq \Theta_A,\ B_j \subseteq \Theta_B, \tag{12}$$

where $C_{A \mapsto B}(A_k) = \{\, b_j \mid (a_i, b_j) \in \Theta_{A,B},\ a_i \in A_k \,\}$.
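Both operations can be sketched compactly when focal elements are stored as sets. The code below is a minimal illustration, not Gister's implementation: `fuse` and `translate` are hypothetical names, the compatibility relation is encoded as a set of (test result, capacity) pairs taken from the table in the second oil-drilling example, and the second body of evidence (`report`) is invented purely to exercise the fusion step.

```python
from itertools import product


def fuse(m1, m2):
    """Dempster's rule: intersect focal elements and renormalize conflict."""
    combined, conflict = {}, 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b   # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be fused")
    return {focal: m / (1.0 - conflict) for focal, m in combined.items()}


def translate(mass, relation):
    """Carry mass from one frame to another through compatible pairs."""
    out = {}
    for focal, m in mass.items():
        image = frozenset(b for a, b in relation if a in focal)
        out[image] = out.get(image, 0.0) + m
    return out


# Compatibility relation between test results and well capacities.
RELATION = {("red", "dry"), ("yellow", "dry"), ("yellow", "trickle"),
            ("green", "trickle"), ("green", "gusher")}
test_result = {frozenset({"red"}): 0.5,
               frozenset({"yellow"}): 0.2,
               frozenset({"green"}): 0.3}
capacity = translate(test_result, RELATION)

# Hypothetical second body of evidence over capacities (illustration only).
report = {frozenset({"trickle", "gusher"}): 0.6,
          frozenset({"dry", "trickle", "gusher"}): 0.4}
fused = fuse(capacity, report)

if __name__ == "__main__":
    print(capacity)
    print(fused)
```

Translating the test-result masses through the relation reproduces the mass function over capacities used in Section 3.2, which is why the relation rather than a conditional probability table is enough to state the domain knowledge.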

Several other evidential operations have been defined and are described elsewhere [12].

We now describe a new evidential operation for making decisions. Its operation is analogous
to the evaluation of a choice node in probabilistic decision analysis, except that it is
defined  for  belief  functions  and  substitutes  the  notion  of  evidential  expected  utility  for  the 
probabilistic  expected  utility. 
• Decision: to choose an action based on a body of evidence representing the states
of nature believed at the time of the decision and a body of evidence representing the
beliefs resulting from any particular decision. Let $m_{\Theta_A}$ represent the beliefs in frame
$\Theta_A$ about the state of nature at the time a decision is to be made. Let frame $\Theta_D$ be
the possible actions that can be taken. Let $m_{\Theta_U}(U \mid A, D)$ represent the beliefs over
the utility frame $\Theta_U$ that result from making decision D when A is true. The mass
function representing the best policy is

$$m_{\Theta_A \times \Theta_D}(A, D) = \begin{cases} m_{\Theta_A}(A) & \text{if } E(m_{\Theta_U}(U \mid A, D)) \ge E(m_{\Theta_U}(U \mid A, D_i)) \quad \forall D_i \in \Theta_D,\ D_i \ne D \\ 0 & \text{otherwise.} \end{cases} \tag{13}$$

Ties are broken arbitrarily. The optimal policy computed by the decision node is given
by the focal elements (A, D) of $m_{\Theta_A \times \Theta_D}$: if A is the most precise
statement known to be true, then the best decision is D.
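The decision operation can be sketched as a function that moves each state's mass onto the (state, action) pair with the greatest evidential expected utility. This is a simplified illustration with hypothetical names: instead of a full belief function over the utility frame, each (state, action) pair is given directly as an expected utility interval, and the example uses the post-test values of the second oil-drilling example.

```python
def eev(interval, p):
    """Evidential expected value of a (low, high) interval."""
    lo, hi = interval
    return lo + p * (hi - lo)


def decision_operator(state_mass, utilities, p):
    """Return the policy as a mass function over (state, action) pairs.

    state_mass maps each focal state to its mass; utilities maps each
    (state, action) pair to an expected utility interval.
    """
    policy = {}
    for state, mass in state_mass.items():
        actions = [a for (s, a) in utilities if s == state]
        best = max(actions, key=lambda a: eev(utilities[(state, a)], p))
        policy[(state, best)] = mass   # all other pairs implicitly get 0
    return policy


def policy_value(policy, utilities, p):
    """Expected value of following the policy, for the assumed p."""
    return sum(mass * eev(utilities[pair], p) for pair, mass in policy.items())


STATE_MASS = {"Red": 0.5, "Yellow": 0.2, "Green": 0.3}
UTILITIES = {
    ("Red", "No Drill"): (-10_000, -10_000), ("Red", "Drill"): (-80_000, -80_000),
    ("Yellow", "No Drill"): (-10_000, -10_000), ("Yellow", "Drill"): (-80_000, 40_000),
    ("Green", "No Drill"): (-10_000, -10_000), ("Green", "Drill"): (40_000, 190_000),
}

if __name__ == "__main__":
    policy = decision_operator(STATE_MASS, UTILITIES, p=0.5)
    print(policy)
    print(policy_value(policy, UTILITIES, p=0.5))
```

Returning the policy as a mass function, rather than a bare list of actions, is what lets the result be passed on to other evidential operations.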


Figure 9: Evidential Decision Analysis of the Second Oil Drilling Example

Within Gister, a decision node is represented by a square. An evidence node leading into
the decision node represents what would be known at the time a decision is to be made. The
output of the decision node is the optimal policy (as defined above), and is represented as
a belief function over the cross-product frame of states of nature and alternative decisions
($\Theta_A \times \Theta_D$). That belief function is then available to other evidential reasoning operations:
it  may  be  discounted,  translated  to  a  dependent  frame,  fused  with  additional  evidence  that 
would  only  be  available  after  the  decision  is  taken,  etc.  With  this  definition,  a  decision  node 
represents  a  primitive  operation  that  can  be  included  in  the  data  flow  represented  by  an 
analysis.  Figure  9  illustrates  the  analysis  that  was  constructed  for  representing  the  second 
oil-drilling  example  within  Gister. 

The  optimal  policies  computed  by  the  Test?  and  Drill?  nodes  are  summarized  below 
(assuming that p = 0.5, printed as Rho in the output). The result is identical to the strategy computed using a decision
tree  as  can  be  verified  by  comparison  with  Figure  7. 
Decision Test?:   Expected value: 27500    Rho=0.5

  1.0 = m(( No Test  Test )) -- Test
        E(x) =  27500   EVI = [   5000    50000]    Test
        E(x) =    500   EVI = [ -34000    35000]    No Test

Decision Drill?:  Expected value: 27500    Rho=0.5

  0.50 = m(( Test & Red )) -- No Drill
        E(x) = -10000   EVI = [ -10000   -10000]    No Drill
        E(x) = -80000   EVI = [ -80000   -80000]    Drill

  0.20 = m(( Test & Yellow )) -- No Drill
        E(x) = -10000   EVI = [ -10000   -10000]    No Drill
        E(x) = -20000   EVI = [ -80000    40000]    Drill

  0.30 = m(( Test & Green )) -- Drill
        E(x) = 115000   EVI = [  40000   190000]    Drill
        E(x) = -10000   EVI = [ -10000   -10000]    No Drill